Consolidate all dependencies into a single indexed file (to reduce I/O)#704
Consolidate all dependencies into a single indexed file (to reduce I/O)#704
Conversation
There was a problem hiding this comment.
Benchmark Index (enterprise)
Details
| Benchmark suite | Current: 57320a2 | Previous: 6d786dc | Ratio |
|---|---|---|---|
Add one schema (0 existing) |
37 ms |
48 ms |
0.77 |
Add one schema (100 existing) |
237 ms |
300 ms |
0.79 |
Add one schema (1000 existing) |
2453 ms |
2827 ms |
0.87 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Benchmark Index (community)
Details
| Benchmark suite | Current: 57320a2 | Previous: 6d786dc | Ratio |
|---|---|---|---|
Add one schema (0 existing) |
35 ms |
42 ms |
0.83 |
Add one schema (100 existing) |
241 ms |
280 ms |
0.86 |
Add one schema (1000 existing) |
2493 ms |
2852 ms |
0.87 |
This comment was automatically generated by workflow using github-action-benchmark.
5b26980 to
95664a3
Compare
Signed-off-by: Juan Cruz Viotti <jv@jviotti.com>
deps.json file
🤖 Augment PR SummarySummary: This PR experiments with consolidating build dependency tracking into a single output-level file ( Changes:
Technical Notes: 🤖 Was this summary useful? React with 👍 or 👎 |
| dependency = (this->root / dependency).lexically_normal(); | ||
|
|
||
| const char tag{contents[position]}; | ||
| const std::string_view value{contents.data() + position + 2, |
There was a problem hiding this comment.
Parsing deps.txt lines via value doesn’t strip a trailing \r, so CRLF files can end up with keys/paths that include \r and won’t match later lookups. This is especially likely on Windows where text-mode writes typically produce CRLF.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| std::stoll(std::string{ns_part})}; | ||
| const auto mark_value{mark_type{ | ||
| std::chrono::duration_cast<mark_type::duration>(nanoseconds)}}; | ||
| this->marks.insert_or_assign( |
There was a problem hiding this comment.
Marks loaded from deps.txt are cached even if the corresponding output file no longer exists; later mark() can return this cached value and build() may skip regenerating a missing destination. That can cascade into failures like Output::track() asserting the file exists even though it was deleted between runs.
Severity: high
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| std::ofstream stream{deps_path}; | ||
| assert(!stream.fail()); | ||
|
|
||
| for (const auto &entry : this->dependencies_map) { |
There was a problem hiding this comment.
flush_dependencies() iterates dependencies_map/marks without taking the associated locks; if any thread is still calling write_dependencies()/refresh(), this becomes a data race/UB. If the method is intended to be thread-safe, it likely needs shared locking (or an explicit contract that all work is joined first).
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| dependencies_stream.flush(); | ||
| dependencies_stream.close(); | ||
| sourcemeta::core::flush(dependencies_path); | ||
| stream.flush(); |
There was a problem hiding this comment.
deps.txt is only stream.flush()’d/closed, but it’s no longer passed through sourcemeta::core::flush like the previous per-target .deps writes were. If incremental rebuild correctness relies on this file surviving crashes/power loss, it may need the same durability treatment.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| // Output files should always have their marks cached | ||
| // Only input files or new output files are not | ||
| assert(!this->has_previous_run || | ||
| !path.string().starts_with(this->root.string()) || |
There was a problem hiding this comment.
The path.string().starts_with(this->root.string()) check can misclassify paths that merely share a prefix (e.g., /out vs /out2) and can be brittle with normalization/case differences. That could make this debug assertion fire for non-output files or miss real output-root paths.
Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
Signed-off-by: Juan Cruz Viotti jv@jviotti.com